A Generic Model to Compose Vision Modules for Holistic Scene Understanding

Authors

Congcong Li

Adarsh Kowdle

Ashutosh Saxena

Tsuhan Chen

Abstract

Holistic scene understanding involves many vision tasks, such as depth estimation, scene categorization, and event categorization. Each task explores a different aspect of the scene, but the tasks are related in that they represent attributes of the same scene. The intuition is that one task can provide meaningful attributes to aid the learning process of another. In this work, we propose a generic model (together with learning and inference techniques) for connecting different vision tasks in the form of a 2-layer cascade. Our model treats the first layer as a hidden layer, whose latent variables are inferred through feedback from the second layer. The feedback mechanism allows the first-layer classifiers to focus on more important image modes and draws their outputs towards “attributes” rather than the original “labels”. Our model also automatically discovers sparse connections between the attributes learned on the first layer and the target task on the second layer. Note that in our model the same vision task can act as an attribute learner or as a target task, depending on the layer on which it is placed. In extensive experiments, we show that the same proposed model improves performance on all the tasks we consider: single-image depth estimation, scene categorization, saliency detection, and event categorization.
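The two-layer structure described in the abstract can be illustrated with a small sketch. This is not the authors' learning or inference procedure: the feedback step is omitted, the first-layer modules are fixed random linear maps (made orthonormal so the attributes are roughly decorrelated), and the sparse second layer is fitted with plain ISTA on an L1-penalized least-squares objective. All names (`first_layer_outputs`, `fit_sparse_second_layer`, `M`, `lam`) and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def soft_threshold(w, t):
    """Proximal operator of t * ||w||_1."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)


def objective(A, y, w, lam):
    """(1 / 2n) * ||A w - y||^2 + lam * ||w||_1."""
    r = A @ w - y
    return 0.5 * np.mean(r ** 2) + lam * np.abs(w).sum()


def first_layer_outputs(X, M):
    """Hypothetical first layer: each column of M is one linear 'module'
    that maps image features to a single attribute score."""
    return X @ M


def fit_sparse_second_layer(A, y, lam=0.05, n_iter=500):
    """Fit second-layer weights on first-layer attributes with ISTA.
    The L1 penalty pushes weights of unhelpful attributes towards zero."""
    n, k = A.shape
    step = n / (np.linalg.norm(A, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    w = np.zeros(k)
    for _ in range(n_iter):
        grad = A.T @ (A @ w - y) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w


# Synthetic data (assumption): 200 "images" with 5 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

# Four hypothetical first-layer modules with orthonormal weight vectors.
M, _ = np.linalg.qr(rng.normal(size=(5, 4)))
A = first_layer_outputs(X, M)

# The target task depends on attributes 0 and 2 only.
y = 2.0 * A[:, 0] - 1.0 * A[:, 2] + 0.01 * rng.normal(size=200)

lam = 0.05
w = fit_sparse_second_layer(A, y, lam=lam)
print("second-layer weights:", np.round(w, 3))
```

In this toy setup the second layer should place most of its weight on the two attributes that actually drive the target, which mirrors the idea of discovering sparse connections between layers. A fuller treatment would also update the first-layer modules using feedback from the second layer, as the abstract describes.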

Similar resources

Combining Plane Estimation with Shape Detection for Holistic Scene Understanding

Structural scene understanding is an interconnected process wherein modules for object detection and supporting structure detection need to co-operate in order to extract cross-correlated information, thereby utilizing the maximum possible information rendered by the scene data. Such an inter-linked framework provides a holistic approach to scene understanding, while obtaining the best possible...

Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding

Recent trends in image understanding have pushed for holistic scene understanding models that jointly reason about various tasks such as object detection, scene recognition, shape analysis, contextual reasoning, and local appearance based classifiers. In this work, we are interested in understanding the roles of these different tasks in improved scene understanding, in particular semantic segme...

Knowledge Representation and Inference for Grasp Affordances

Knowledge bases for semantic scene understanding and processing form indispensable components of holistic intelligent computer vision and robotic systems. Specifically, task based grasping requires the use of perception modules that are tied with knowledge representation systems in order to provide optimal solutions. However, most state-of-the-art systems for robotic grasping, such as the KCoPM...

From holistic scene understanding to semantic visual perception: A vision system for mobile robot

Semantic visual perception for knowledge acquisition plays an important role in human cognition, as well as in the many tasks expected to be performed by a cognitive robot. In this paper, we present a vision system designed for indoor mobile robotic systems. Inspired by recent studies on holistic scene understanding, we generate spatial information in the scene by considering plane estimation a...

Automated Scene Understanding for Airport Aprons

This paper presents a complete visual surveillance system for automatic scene interpretation of airport aprons. The system comprises two main modules — Scene Tracking and Scene Understanding. The Scene Tracking module is responsible for detecting, tracking and classifying the semantic objects within the scene using computer vision. The Scene Understanding module performs high level interpretati...

Publication date: 2010
